Rapid Estimates of Statistical Significance of the Pairwise Nucleotide Sequence Alignment
Authors
Abstract
The central question when comparing sequences is whether the observed similarity is statistically significant. This problem has not yet been solved mathematically for optimal alignments that contain insertions and deletions. We carried out a regression analysis of the observed similarity of random sequences as a function of their length and nucleotide composition, and we propose a practical method for estimating the probability that an observed similarity is statistically significant. Once the regression parameters have been determined for a given alignment scheme (similarity matrix and deletion penalties), the statistical significance of the similarity observed for a pair of nucleotide sequences can be estimated precisely from their lengths and nucleotide composition alone.
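To make the idea concrete, the following is a minimal sketch of the brute-force baseline that such a regression would stand in for. It is not the authors' method, and the regression parameters themselves are not given in this abstract. The sketch scores random sequences that have the same lengths and nucleotide composition as the input pair and reports how many standard deviations the observed score lies above the random mean. The scoring values (match +1, mismatch -1, gap -2), the function names, and the number of trials are all assumptions chosen for illustration.

```python
import random
import statistics


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not the paper's scheme.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def composition(seq):
    """Relative frequencies of A, C, G, T in seq."""
    return [seq.count(n) / len(seq) for n in "ACGT"]


def random_seq(length, freqs, rng):
    """Random sequence of the given length and nucleotide composition."""
    return "".join(rng.choices("ACGT", weights=freqs, k=length))


def z_score(a, b, trials=200, seed=0):
    """Standard deviations by which score(a, b) exceeds random expectation."""
    rng = random.Random(seed)
    fa, fb = composition(a), composition(b)
    null = [
        nw_score(random_seq(len(a), fa, rng), random_seq(len(b), fb, rng))
        for _ in range(trials)
    ]
    mu = statistics.mean(null)
    sigma = statistics.stdev(null)
    if sigma == 0:
        raise ValueError("random scores have zero variance; use longer sequences")
    return (nw_score(a, b) - mu) / sigma
```

Each call to z_score repeats the full simulation. A regression fitted once over many lengths and compositions, as the abstract describes, would instead predict the random mean and spread directly from those inputs.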
Similar resources
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
FPGA architecture for pairwise statistical significance estimation
Sequence comparison is one of the most fundamental computational problems in bioinformatics. Pairwise sequence alignment methods align two sequences using a substitution matrix consisting of pairwise scores of aligning different residues with each other (like BLOSUM62), and give an alignment score for the given sequence-pair. This work addresses the problem of accurately estimating statistica...
Fundamentals of massive automatic pairwise alignments of protein sequences: theoretical significance of Z-value statistics
MOTIVATION: Different automatic methods of sequence alignments are routinely used as a starting point for homology searches and function inference. Confidence in an alignment probability is one of the major fundamentals of massive automatic genome-scale pairwise comparisons, for clustering of putative orthologs and paralogs, sequenced genome annotation or multiple-genomic tree constructions. Ext...
Enhancing Parallelism of Pairwise Statistical Significance Estimation for Local Sequence Alignment
Pairwise statistical significance (PSS) has been found to be able to accurately identify related sequences (homology detection), which is a fundamental step in numerous applications relating to sequence analysis. Although more accurate than database statistical significance, it is both computationally intensive and data intensive to construct the empirical score distribution during the estimati...
Rapid and accurate estimates of statistical significance for sequence data base searches.
A central question in sequence comparison is the statistical significance of an observed similarity. For local alignment containing gaps to optimize sequence similarity this problem has so far not been solved mathematically. Using as a basis the Chen-Stein theory of Poisson approximation, we present a practical method to approximate the probability that a local alignment score is a result of ch...
Publication date: 2001